
    Waiting for Q: An Exploration of QAnon Users' Online Migration to Poal in the Wake of Voat's Demise

    Controversial and hateful online communities often face bans on mainstream social networks as part of moderation efforts. Reddit has implemented such measures by banning various communities and users in recent years. Although banning these communities eliminates the problem on one platform, the dispersal of users to other social networks, often with more lenient policies and moderation, remains a challenge. Voat, a Reddit-like platform that provided a haven for sharing controversial and hateful discourse, experienced significant growth in popularity with every Reddit ban until its eventual shutdown in December 2020. This study examines the migration patterns of QAnon adherents across Reddit, Voat, and Poal following the sudden Reddit bans and the announced Voat shutdown. We perform a large-scale measurement of the migrations, including Poal's influx following the Voat shutdown, which has not been studied before under this lens. We observe, among other things, that Reddit users anticipated the imminent bans and suggested Voat as an alternative. However, after their most used subreddit, r/CBTS_Stream, was banned, users moved to r/greatawakening instead. It was only after the banning of both subreddits that users migrated to Voat. Similarly, Voat users proposed Poal as an alternative after Voat announced its shutdown, but their migration primarily occurred on Voat's last day, with approximately half of Voat's QAnon user base moving to Poal. This is a significantly higher percentage than that of the Reddit-to-Voat migration (9.7%), indicating that an advance warning, even with little time left, can help a community coordinate an effective migration. Lastly, our research uncovers evidence suggesting that discussions and planning related to the January 6th attack on the US Capitol emerged predominantly among Poal users, many of whom were Voat migrants.
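    The migration figures above come from matching community members across platforms. As a minimal sketch (not the authors' pipeline), a rate such as the 9.7% Reddit-to-Voat figure could be estimated by intersecting the set of users active in the banned community with the accounts appearing on the destination platform; the matching heuristic and all names and data below are hypothetical.

```python
# Hedged sketch: estimating a cross-platform migration rate by matching
# usernames from a banned community against accounts on the destination
# platform. Same-handle matching is an assumption, not the paper's method.

def migration_rate(source_users: set[str], destination_users: set[str]) -> float:
    """Fraction of the source community that reappears on the destination."""
    if not source_users:
        return 0.0
    migrants = source_users & destination_users  # same-handle matching
    return len(migrants) / len(source_users)

# Toy example reproducing the order of magnitude of the 9.7% figure.
reddit_qanon = {f"user{i}" for i in range(1000)}
voat_qanon = {f"user{i}" for i in range(903, 1000)} | {"unrelated_account"}
print(f"{migration_rate(reddit_qanon, voat_qanon):.1%}")  # -> 9.7%
```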

    You Shall Not Pass! Measuring, Predicting, and Detecting Malware Behavior

    Researchers have been fighting malicious behavior on the Internet for several decades. The arms race is far from over, but this PhD work is intended to be another step towards the goal of making the Internet a safer place. My PhD has focused on measuring, predicting, and detecting malicious behavior on the Internet; we focused our efforts along three different paths: establishing causality relations for malicious actions, predicting the actions taken by an attacker, and detecting malicious software. This work tried to understand the causes of malicious behavior in different scenarios (sandboxing, web browsing) by applying a novel statistical framework and statistical tests to determine what triggers malware. We also used deep learning algorithms to predict what actions an attacker would perform, with the goal of anticipating and countering the attacker's moves. Moreover, we worked on malware detection for Android by modeling sequences of API calls with Markov chains and applying machine learning algorithms to classify benign and malicious apps. The methodology, design, and results of our research advance the state of the art in the field; we go through the different contributions developed during my PhD to explain the design choices, the statistical methods, and the takeaways characterizing them. We also show how these systems have an impact on current tool development and future research trends.
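    The Android detection approach mentioned above lends itself to a short illustration. The following sketch (a toy under stated assumptions, not the thesis code) models an app's sequence of abstracted API calls as a Markov chain and feeds the flattened transition matrix to an off-the-shelf classifier; the state space, sequences, and labels are all invented.

```python
# Toy sketch: Markov-chain transition features over API-call sequences,
# then a standard classifier. All data here is hypothetical.
from collections import Counter

import numpy as np
from sklearn.ensemble import RandomForestClassifier

STATES = ["java.lang", "android.os", "java.net", "android.telephony"]

def transition_features(api_sequence: list[str]) -> np.ndarray:
    """Row-normalised transition counts, flattened into a feature vector."""
    counts = Counter(zip(api_sequence, api_sequence[1:]))
    matrix = np.array([[counts[(a, b)] for b in STATES] for a in STATES], dtype=float)
    row_sums = matrix.sum(axis=1, keepdims=True)
    return np.divide(matrix, row_sums, out=np.zeros_like(matrix), where=row_sums > 0).flatten()

# One abstracted API sequence per app, with a benign (0) / malicious (1) label.
sequences = [
    ["java.lang", "java.lang", "android.os"],
    ["android.telephony", "java.net", "java.net", "android.os"],
]
labels = [0, 1]
X = np.array([transition_features(s) for s in sequences])
clf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, labels)
print(clf.predict(X))
```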

    A family of droids -- Android malware detection via behavioral modeling: static vs dynamic analysis

    Following the increasing popularity of mobile ecosystems, cybercriminals have increasingly targeted them, designing and distributing malicious apps that steal information or cause harm to the device's owner. Aiming to counter them, detection techniques that model Android malware using either static or dynamic analysis have been proposed. While the pros and cons of these analysis techniques are known, they are usually compared in terms of their limitations, e.g., static analysis cannot capture runtime behaviors, full code coverage is usually not achieved during dynamic analysis, etc. In this paper, by contrast, we analyze the performance of static and dynamic analysis methods in the detection of Android malware and compare them in terms of their detection performance, using the same modeling approach. To this end, we build on MaMaDroid, a state-of-the-art detection system that relies on static analysis to create a behavioral model from sequences of abstracted API calls. Then, aiming to apply the same technique in a dynamic analysis setting, we modify CHIMP, a platform recently proposed to crowdsource human inputs for app testing, to extract sequences of API calls from the traces produced while executing the app on a CHIMP virtual device. We call this system AuntieDroid and instantiate it using both automated (Monkey) and user-generated inputs. We find that combining static and dynamic analysis yields the best performance, with F-measure reaching 0.92. We also show that static analysis is at least as effective as dynamic analysis, depending on how apps are stimulated during execution, and, finally, investigate the reasons for inconsistent misclassifications across methods.
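    A key preprocessing step in the approach described above is abstracting raw API calls to a coarser level before building the behavioral model. Here is a minimal sketch of that idea; the call names are illustrative, and the paper's exact abstraction rules differ in detail.

```python
# Minimal sketch of API-call abstraction: raw 'pkg.Class.method' calls are
# lifted to their package or family before sequence modelling. The traces
# below are invented, not taken from the paper's dataset.

def abstract_call(api_call: str, mode: str = "package") -> str:
    """Abstract 'pkg.Class.method' to its package or top-level family."""
    package = api_call.rsplit(".", 2)[0]      # drop Class.method
    if mode == "package":
        return package
    if mode == "family":
        return package.split(".")[0]          # e.g. 'android', 'java'
    return api_call

trace = ["java.lang.String.length", "android.telephony.TelephonyManager.getDeviceId"]
print([abstract_call(c) for c in trace])                  # ['java.lang', 'android.telephony']
print([abstract_call(c, mode="family") for c in trace])   # ['java', 'android']
```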

    The Cause of All Evils: Assessing Causality Between User Actions and Malware Activity

    Malware samples are created at a pace that makes it difficult for analysts to keep up. When analyzing an unknown malware sample, it is important to assess its capabilities to determine how much damage it can do to its victims, and to make prioritization decisions about which threats should be dealt with first. In a corporate environment, for example, a malware infection that is able to steal financial information is much more critical than one that is sending email spam, and should be dealt with at the highest priority. In this paper we present a statistical approach able to determine causality relations between a specific trigger action (e.g., a user visiting a certain website in the browser) and the activity of a malware sample. We show that we can learn the type of a malware sample by presenting it with a number of trigger actions commonly performed by users and studying which events the malware reacts to. We show that our approach is able to correctly infer causality relations between information-stealing malware and login events on websites, as well as between adware and websites containing advertisements.
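    As a rough illustration of the kind of question such a framework answers, the sketch below runs a simple permutation test for whether malware events cluster shortly after a trigger action. The paper's actual statistical machinery is more involved, and the timings, window, and horizon here are all made up.

```python
# Deliberately simple stand-in for the paper's statistical framework:
# test whether malware activity is more frequent in the window right
# after a trigger action than chance would predict (permutation test).
import random

def permutation_test(trigger_times, event_times, window=5.0, n_perm=10_000, horizon=1000.0):
    """P-value for 'events follow triggers within `window` seconds more than chance'."""
    def hits(triggers):
        return sum(any(0 <= e - t <= window for t in triggers) for e in event_times)

    observed = hits(trigger_times)
    count = sum(
        hits([random.uniform(0, horizon) for _ in trigger_times]) >= observed
        for _ in range(n_perm)
    )
    return count / n_perm

# Hypothetical traces: malware events cluster shortly after user triggers.
triggers = [100.0, 400.0, 700.0]
events = [101.5, 402.0, 703.0, 850.0]
print(permutation_test(triggers, events))  # small p-value suggests a causal link
```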

    A Systematic Literature Review of the Use of Computational Text Analysis Methods in Intimate Partner Violence Research

    Purpose: Computational text mining methods are proposed as a useful methodological innovation in Intimate Partner Violence (IPV) research. Text mining can offer researchers access to existing or new datasets, sourced from social media or from IPV-related organisations, that would be too large to analyse manually. This article aims to give an overview of current work applying text mining methodologies in the study of IPV, as a starting point for researchers wanting to use such methods in their own work. Methods: This article reports the results of a systematic review of academic research using computational text mining to research IPV. A review protocol was developed according to PRISMA guidelines, and a literature search of 8 databases was conducted, identifying 22 unique studies that were included in the review. Results: The included studies cover a wide range of methodologies and outcomes. Supervised and unsupervised approaches are represented, including rule-based classification (n = 3), traditional Machine Learning (n = 8), Deep Learning (n = 6) and topic modelling (n = 4) methods. Datasets are mostly sourced from social media (n = 15), with other data sourced from police forces (n = 3), health or social care providers (n = 3), or litigation texts (n = 1). Evaluation methods mostly used a held-out, labelled test set or k-fold cross-validation, with accuracy and F1 metrics reported. Only a few studies commented on the ethics of computational IPV research. Conclusions: Text mining methodologies offer promising data collection and analysis techniques for IPV research. Future work in this space must consider the ethical implications of computational approaches.
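    For readers unfamiliar with the evaluation setup most reviewed studies used, here is a hedged, self-contained example of k-fold cross-validation of a text classifier reporting accuracy and F1. The pipeline and data are placeholders, not drawn from any reviewed study.

```python
# Illustrative evaluation setup: 5-fold cross-validation of a TF-IDF +
# logistic regression text classifier, reporting accuracy and F1.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline

# Dummy corpus and binary labels, standing in for labelled IPV-related posts.
texts = ["example post one", "another example post", "a third text", "one more sample"] * 5
labels = [0, 1, 0, 1] * 5

pipeline = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
scores = cross_validate(pipeline, texts, labels, cv=5, scoring=["accuracy", "f1"])
print(scores["test_accuracy"].mean(), scores["test_f1"].mean())
```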

    The Ethics of Going Deep: Challenges in Machine Learning for Sensitive Security Domains

    Machine learning models can sometimes determine the trajectory of a human life, and a series of cascading ethical failures could be irreversible. Ethical concerns are set to increase, in particular as algorithmic decision-making is injected into highly sensitive security contexts. In cybercrime, there have been cases of algorithms failing to identify racist and hateful speech, as well as missing cases of Image-Based Sexual Abuse. Hence, this paper intends to add a voice of caution about the vulnerabilities pervading the different stages of a machine learning development pipeline and the ethical challenges that these potentially nurture and perpetuate. To highlight both the issues and potential fixes in an adversarial environment, we use Child Sexual Exploitation and its implications on the Internet as a case study, with 2021 being its worst year according to the Internet Watch Foundation.

    Understanding and preventing the advertisement and sale of illicit drugs to young people through social media: A multidisciplinary scoping review

    ISSUES: The sale of illicit drugs online has expanded to mainstream social media apps. These platforms provide access to a wide audience, especially children and adolescents. Research is in its infancy and scattered due to the multidisciplinary aspects of the phenomenon. APPROACH: We present a multidisciplinary systematic scoping review on the advertisement and sale of illicit drugs to young people. Peer-reviewed studies written in English, Spanish and French were searched for the period 2015 to 2022. We extracted data on users, drugs studied, rate of posts, terminology used and study methodology. KEY FINDINGS: A total of 56 peer-reviewed papers were included. Their analysis highlights the variety of drugs advertised and the platforms used to do so. Various methodological designs were considered. Approaches to detecting illicit content were the focus of many studies, as algorithms move from detecting drug-related keywords to detecting drug-selling behaviour. We found that, on average across the studies reviewed, 13 in 100 social media posts advertise illicit drugs. However, the popular platforms used by adolescents are rarely studied. IMPLICATIONS: Promotional content is increasing in sophistication to appeal to young people, shifting towards healthy, glamourous and seemingly legal depictions of drugs. Greater interdisciplinary collaboration between computational and qualitative approaches is needed to comprehensively study the sale and advertisement of illegal drugs on social media across different platforms. This requires coordinated action from researchers, policy makers and service providers.
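    To make the keyword-versus-behaviour distinction concrete, the toy sketch below shows the older keyword-matching style of detection and why it is noisy; the slang lexicon and posts are invented for illustration.

```python
# Toy keyword-based detection, the style earlier studies in the review used
# (newer work models selling behaviour instead). Lexicon is hypothetical.
import re

DRUG_SLANG = {"plug", "gas", "zaza"}  # invented keyword lexicon

def flag_post(text: str) -> bool:
    """Flag a post if it contains any lexicon keyword as a whole word."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return bool(tokens & DRUG_SLANG)

posts = ["hit my plug for gas", "filling up on gas before the road trip"]
print([flag_post(p) for p in posts])  # both flagged: keywords alone are noisy
```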

    Cerberus: Exploring Federated Prediction of Security Events

    Modern defenses against cyberattacks increasingly rely on proactive approaches, e.g., predicting the adversary's next actions based on past events. Building accurate prediction models requires knowledge from many organizations; alas, this entails disclosing sensitive information, such as network structures, security postures, and policies, which might often be undesirable or outright impossible. In this paper, we explore the feasibility of using Federated Learning (FL) to predict future security events. To this end, we introduce Cerberus, a system enabling collaborative training of Recurrent Neural Network (RNN) models for participating organizations. The intuition is that FL could offer a middle ground between the non-private approach, where the training data is pooled at a central server, and the low-utility alternative of only training local models. We instantiate Cerberus on a dataset obtained from a major security company's intrusion prevention product and evaluate it vis-à-vis utility, robustness, and privacy, as well as how participants contribute to and benefit from the system. Overall, our work sheds light on both the positive aspects and the challenges of using FL for this task and paves the way for deploying federated approaches to predictive security.
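    As a hedged sketch of the federated setup Cerberus explores, the code below runs one FedAvg round over a toy next-event RNN. The model size, optimiser, and data are placeholders rather than the paper's configuration, and a real deployment would add privacy defenses on top of plain weight averaging.

```python
# Minimal FedAvg round for a next-security-event RNN (hypothetical setup).
import copy

import torch
import torch.nn as nn

class EventRNN(nn.Module):
    """Tiny GRU that predicts the next security event ID from a sequence."""
    def __init__(self, n_events=50, emb=16, hidden=32):
        super().__init__()
        self.embed = nn.Embedding(n_events, emb)
        self.gru = nn.GRU(emb, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_events)

    def forward(self, x):
        out, _ = self.gru(self.embed(x))
        return self.head(out[:, -1])  # logits for the next event

def local_update(global_model, batch, steps=1, lr=0.01):
    """One organization trains a copy of the global model on local data."""
    model = copy.deepcopy(global_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    x, y = batch
    for _ in range(steps):
        opt.zero_grad()
        nn.functional.cross_entropy(model(x), y).backward()
        opt.step()
    return model.state_dict()

def fed_avg(states):
    """Average client weights; only the updates, not the logs, leave a client."""
    avg = copy.deepcopy(states[0])
    for key in avg:
        avg[key] = torch.stack([s[key] for s in states]).mean(dim=0)
    return avg

# Hypothetical round with three organisations, each holding local event logs.
global_model = EventRNN()
clients = [(torch.randint(0, 50, (8, 10)), torch.randint(0, 50, (8,))) for _ in range(3)]
global_model.load_state_dict(fed_avg([local_update(global_model, b) for b in clients]))
```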
